ABSTRACT


One of the greatest challenges for human beings is to
perceive the future so that we can prepare for it. The
future of a process or a phenomenon depends on past
observations, which are used to construct the time series forecasting
model. Traditionally, statistical or stochastic models were
employed to model a time series. The recent trend is to apply
Artificial Neural Network methods.

In the present work, a comparison of the performance of
statistical and Neural Network methods for time series forecasting
is presented for some classical problems. The Box-Jenkins approach is
used for the statistical modelling. The Neural Network models studied
are Back Propagation Through Time and Time Delay Neural
Networks.
Chapter 1

INTRODUCTION


1.1 Introduction

It is human nature to want to know in advance what is likely to happen in the future. Observing
past outcomes of a phenomenon in order to anticipate its future behaviour represents the
essence of forecasting or prediction. If a complete mathematical model describing a
studied phenomenon is known and not very complex, and if the initial conditions are
sufficiently defined, forecasting becomes a trivial task. But when an analytical model is
unknown or too complex, a typical alternative is to forecast by building a model
that takes into account only previous outcomes of the phenomenon while ignoring any
exterior influence.

Forecasting is predicting the short term evolution of a system or a phenomenon.
Forecasting a system or a natural phenomenon is of utmost importance: forecasts help in
planning and also in preventing a forecasted disaster. Some of the major applications of
forecasting include weather, rainfall, natural calamities, sales and stock
markets, population and electric load demand.

The outcomes of the phenomenon over time form a time series. A time series can
be defined as a function x of an independent variable t, arising from a process for which a
mathematical description is unknown. Time series prediction problems are approached
either from a stochastic perspective or, more recently, from a neural network perspective.
Each of these methods has its own advantages and disadvantages.

Statistical Methods

The statistical models include the Autoregressive (AR) model, the Moving Average
(MA) model and the combination of the two, the Autoregressive Moving Average (ARMA) model.
These models have limited applicability as they are commonly linear. The
advantage of these models is that they are fast. The statistical model most commonly used in the
literature is that proposed by Box and Jenkins.
Neural Network Methods

Among the various potential applications of neural networks, forecasting is considered to
be a major one. Neural Network models are powerful with regard to the
accuracy of prediction.

The Back Propagation algorithm is the most popular method for the design of
neural networks. However, it does not incorporate dynamical behaviour, which is a
must for forecasting problems. Neural Network models that incorporate this
dynamical property include Time Delay Neural Networks, Temporal Back
Propagation and Back Propagation Through Time. This work employs two Neural Network
methods for forecasting a time series: Time Delay Neural Networks and Back
Propagation Through Time. A statistical method is also used to model the time series, and
its accuracy of prediction is compared with that of the Neural Network methods.

1.2 Problem Definition

This thesis compares the forecasting performance on a time series of two approaches:
the traditional Box-Jenkins statistical method and Neural Network methods.
The two Neural Network models discussed are Back Propagation Through
Time and Time Delay Neural Networks. A comparison of the two approaches is presented
by taking some classical problems of time series modelling.

1.3 Organisation of the thesis

A detailed discussion of the statistical approach of the Box-Jenkins model is presented in
Chapter 2, wherein all the stages of model building are described. Chapter 3 covers the concepts and the
algorithms of Back Propagation Through Time and Time Delay Neural Networks. Results
of all the above methods for some classical examples, such as the sunspots series and the save rate
data, are presented in Chapter 4. Finally, conclusions and scope for future work are
summarised in Chapter 5.
Chapter 2

BOX-JENKINS METHOD


An important part of statistics deals with the analysis of data that are collected
sequentially over time, or time series data. There are two objectives in analyzing time
series data: one, to model the historical data, and two, to forecast or predict future
values of the series. The technique discussed here is that of a single series,
which means that the model and forecasts are based only on past values of the
variable being forecast. These models are termed UBJ (Univariate Box-Jenkins) models
and are also referred to as UBJ-ARIMA models. ARIMA stands for Autoregressive
Integrated Moving Average.

All statistical forecasting methods are extrapolative in nature: they involve the
projection of past patterns or relationships into the future. In the case of UBJ-ARIMA
forecasting, we extrapolate past patterns within a single data series into the future. In
other words, the model is an algebraic statement telling how one thing is statistically
related to one or more other things.



Fig 2.1 The idea of forecasting
2.1 Single-series (univariate) analysis

Time series analysis is used to explain the behavior of time series data using only past
observations on the variable in question. The various observations in time-sequenced
data (..., $z_{t-1}$, $z_t$, $z_{t+1}$, ...) may be statistically dependent. The
concept of correlation is used to measure the relationships between observations
within the series. Fig 2.1 shows the idea of UBJ forecasting.

2.2 When may UBJ models be used?

Short term forecasting

UBJ-ARIMA models are especially suited to short term forecasting because most
ARIMA models place heavy emphasis on the recent past rather than the distant past.
Long term forecasts are less reliable because the observations they depend on are not yet
available and must themselves be replaced by predicted values.

Data types

The UBJ method applies to either discrete data or continuous data. The data must be
measured at equally spaced, discrete time intervals.

Sample size

Building an ARIMA model requires an adequate sample size of at least 50 observations.

Stationary series

The UBJ-ARIMA method applies only to stationary series, that is, series whose mean,
variance and autocorrelation function are essentially constant through time.

2.3 Statistical terms used

2.3.1 Differencing

Non-stationary series, that is, series for which the mean changes over time, can be
transformed into stationary ones by differencing:

$$w_t = z_t - z_{t-1}, \quad t = 2, 3, \ldots, n \qquad (2.1)$$

The series $w_t$ is called the first differences of $z_t$. If the first differences of $z_t$ do
not have a constant mean, then $w_t$ is redefined as the first differences of the first differences:

$$w_t = (z_t - z_{t-1}) - (z_{t-1} - z_{t-2}), \quad t = 3, 4, \ldots, n \qquad (2.2)$$

The series $w_t$ is now called the second differences of $z_t$.
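As a minimal illustration of Eqs (2.1) and (2.2), the sketch below computes first and second differences with NumPy. The sample series and the variable names are assumptions chosen for illustration; they are not data from this thesis.

```python
import numpy as np

# Hypothetical series with a quadratic trend (illustrative only).
z = np.array([1.0, 4.0, 9.0, 16.0, 25.0, 36.0])

# First differences, Eq (2.1): w_t = z_t - z_{t-1}
w1 = np.diff(z, n=1)

# Second differences, Eq (2.2): first differences of the first differences
w2 = np.diff(z, n=2)

print(w1)  # still trending, so the mean is not constant
print(w2)  # constant, so the second differences have a constant mean
```

In this toy case one round of differencing is not enough to remove the changing mean, while two rounds are.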
2.3.2 Deviations from the mean

To focus on the stochastic behavior of a stationary series, the data are expressed in
deviations from the mean, i.e. we define a series $\tilde{z}_t = z_t - \bar{z}$. The two series $z_t$ and $\tilde{z}_t$ have all
the same statistical properties except for their means.

2.3.3 Estimated autocorrelation functions

An autocorrelation coefficient is the correlation coefficient computed between sets
of ordered pairs ($z_t$, $z_{t+k}$) taken from the same series. These autocorrelation coefficients, plotted against the lag k,
are known as the autocorrelation function (acf). A correlation coefficient measures the direction and strength of
the statistical relationship between ordered pairs of observations on two random
variables and can take values between -1 and +1. A value of -1 means perfect
negative correlation, a value of +1 means perfect positive correlation and a value of 0
denotes no correlation.

The standard formula for calculating autocorrelation coefficients is

$$r_k = \frac{\sum_{t=1}^{n-k} (z_t - \bar{z})(z_{t+k} - \bar{z})}{\sum_{t=1}^{n} (z_t - \bar{z})^2} \qquad (2.3)$$

where $\bar{z}$ is the sample mean and n is the number of observations.
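The formula in Eq (2.3) can be sketched in a few lines of NumPy. The function name `estimated_acf` and the toy series are assumptions made for this illustration.

```python
import numpy as np

def estimated_acf(z, max_lag):
    """Return r_1 .. r_max_lag computed as in Eq (2.3)."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    dev = z - z.mean()            # deviations from the sample mean
    denom = np.sum(dev ** 2)      # sum of squared deviations over all n points
    return np.array([np.sum(dev[: n - k] * dev[k:]) / denom
                     for k in range(1, max_lag + 1)])

# Toy series (illustrative only).
r = estimated_acf([1, 2, 3, 4, 5], max_lag=2)
print(r)
```

Each coefficient pairs every observation with the one k periods later, so fewer pairs are available as the lag grows.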


2.3.4 Estimated partial autocorrelation functions

The estimated partial autocorrelation function (pacf) is broadly similar to an estimated
acf. It is used as a guide, along with the estimated acf, in choosing one or more
ARIMA models that might fit the available data. The idea of partial autocorrelation
analysis is that we want to measure how $z_t$ and $z_{t+k}$ are related, but with the effects of
the intervening z values accounted for. The partial autocorrelation coefficients are found by
applying regression techniques.

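A computationally easier way to estimate the $\phi_{kk}$ coefficients is the recursion

$$\phi_{11} = r_1 \qquad (2.4)$$

$$\phi_{kk} = \frac{r_k - \sum_{j=1}^{k-1} \phi_{k-1,j}\, r_{k-j}}{1 - \sum_{j=1}^{k-1} \phi_{k-1,j}\, r_j}, \quad k = 2, 3, \ldots \qquad (2.5)$$

where

$$\phi_{kj} = \phi_{k-1,j} - \phi_{kk}\, \phi_{k-1,k-j}, \quad k = 3, 4, \ldots; \; j = 1, 2, \ldots, k-1$$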
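A hedged sketch of this recursion follows. It starts from $\phi_{11} = r_1$ and updates the intermediate coefficients $\phi_{kj}$ at every lag so that they are available for the next one. The function name `pacf_from_acf` is an assumption, and the input is the theoretical acf of an AR(1) process with coefficient 0.5, chosen only because its pacf is known to cut off after lag 1.

```python
import numpy as np

def pacf_from_acf(r):
    """Partial autocorrelations phi_kk from autocorrelations r_1..r_K.

    r[0] holds r_1, r[1] holds r_2, and so on.
    """
    K = len(r)
    phi = np.zeros((K + 1, K + 1))   # phi[k, j] with 1-based indices
    phi[1, 1] = r[0]
    for k in range(2, K + 1):
        num = r[k - 1] - sum(phi[k - 1, j] * r[k - j - 1] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1, j] * r[j - 1] for j in range(1, k))
        phi[k, k] = num / den
        # Update the lower-order coefficients needed at the next lag.
        for j in range(1, k):
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
    return np.array([phi[k, k] for k in range(1, K + 1)])

# Theoretical acf of an AR(1) process with phi_1 = 0.5: r_k = 0.5 ** k
r_ar1 = np.array([0.5 ** k for k in range(1, 5)])
pacf_ar1 = pacf_from_acf(r_ar1)
print(pacf_ar1)
```

For this input the only non-zero partial autocorrelation is at lag 1, which is the cut-off pattern used later to identify AR processes.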


2.4 AUTOREGRESSIVE INTEGRATED MOVING AVERAGE
(ARIMA) MODELS

There are three common ARIMA processes in the literature. They are

1. Autoregressive (AR) Model

2. Moving Average (MA) Model

3. Autoregressive Moving Average (ARMA) Model

The ARIMA model can be represented in terms of the (p, d, q) notation, where p
indicates the autoregressive order, q the moving average order and d the degree of
differencing necessary to achieve stationarity.

Stationarity requirement

The UBJ-ARIMA models are applicable only to series that are stationary. If
the time series is non-stationary, it is made stationary by differencing.

Invertibility requirement

This ensures that smaller weights are assigned to observations that are further in the
past.

2.4.1 Autoregressive Models

In an autoregressive process of order p, the current observation $z_t$ is expressed as the
sum of three components: a linear combination of the p immediate past observations,
a constant C and a random error component $a_t$ for the current period. The model is
denoted ARIMA(p, 0, 0) or AR(p) and is written mathematically as

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + C + a_t \qquad (2.6)$$

where $\phi_1, \phi_2, \ldots, \phi_p$ are the autoregressive parameters. The term
autoregressive is used since $z_t$ is regressed on observations from the same series.
Using the backshift operator B, defined by $B^k z_t = z_{t-k}$, the model in Eq (2.6) can be rewritten as

$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) z_t = C + a_t \qquad (2.7)$$

where

$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) = \phi(B) \qquad (2.8)$$


Table 2.1 Summary of stationarity conditions for AR coefficients

Model Type               Stationarity Conditions
ARMA(0, q)               Always stationary
AR(1) or ARMA(1, q)      $|\phi_1| < 1$
AR(2) or ARMA(2, q)      $|\phi_2| < 1$,  $\phi_2 + \phi_1 < 1$,  $\phi_2 - \phi_1 < 1$
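The conditions in Table 2.1 are simple enough to check mechanically. The following sketch covers only the AR(1) and AR(2) cases listed in the table; the function name `ar_is_stationary` and the example coefficients are assumptions for illustration.

```python
def ar_is_stationary(phi):
    """Check the Table 2.1 conditions for AR(1) or AR(2) coefficients."""
    if len(phi) == 1:
        return abs(phi[0]) < 1
    if len(phi) == 2:
        phi1, phi2 = phi
        return abs(phi2) < 1 and phi2 + phi1 < 1 and phi2 - phi1 < 1
    raise ValueError("only AR(1) and AR(2) are covered by Table 2.1")

# Hypothetical coefficient sets.
print(ar_is_stationary([0.5]))        # inside the AR(1) bound
print(ar_is_stationary([0.8, 0.3]))   # violates phi_2 + phi_1 < 1
```

The same three inequalities, with $\theta$ in place of $\phi$, give the invertibility check for MA(1) and MA(2) coefficients.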


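2.4.2 Moving Average Models

In a moving average process of order q, the current observation $z_t$ is the sum of the
current random error term and weighted lagged random error terms for the last q periods, together with a
constant C. The ARIMA(0, 0, q) or MA(q) process is given by

$$z_t = C + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q} \qquad (2.9)$$

or

$$z_t = C + (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) a_t \qquad (2.10)$$

where

$$(1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) = \theta(B) \qquad (2.11)$$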


Table 2.2 Summary of invertibility conditions for MA coefficients

Model Type               Invertibility Conditions
ARMA(p, 0)               Always invertible
MA(1) or ARMA(p, 1)      $|\theta_1| < 1$
MA(2) or ARMA(p, 2)      $|\theta_2| < 1$,  $\theta_2 + \theta_1 < 1$,  $\theta_2 - \theta_1 < 1$
2.4.3 Autoregressive Moving Average (ARMA) models

A time series can have both autoregressive and moving average terms, and so has the
features of both. For example, consider the ARMA(1, 1) model given by

$$z_t = C + \phi_1 z_{t-1} + a_t - \theta_1 a_{t-1} \qquad (2.12)$$

or

$$(1 - \phi_1 B) z_t = C + (1 - \theta_1 B) a_t \qquad (2.13)$$

The mixed autoregressive moving average process of order (p, q), denoted by
ARMA(p, q), generalises Eq (2.13) by replacing the two first-order polynomials with the
operators of Eqs (2.8) and (2.11), giving $\phi(B) z_t = C + \theta(B) a_t$.
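To make the ARMA(1, 1) recursion of Eq (2.12) concrete, here is a small sketch that generates a series from a given sequence of random shocks. The function name, the zero starting values and the parameter values are assumptions chosen for illustration.

```python
def simulate_arma11(C, phi1, theta1, shocks, z_start=0.0, a_start=0.0):
    """Generate z_t = C + phi1*z_{t-1} + a_t - theta1*a_{t-1} (Eq 2.12)."""
    z_prev, a_prev = z_start, a_start
    series = []
    for a_t in shocks:
        z_t = C + phi1 * z_prev + a_t - theta1 * a_prev
        series.append(z_t)
        z_prev, a_prev = z_t, a_t
    return series

# Response to a single unit shock (hypothetical parameters).
impulse = simulate_arma11(C=0.0, phi1=0.5, theta1=0.3, shocks=[1.0, 0.0, 0.0])

# With no shocks at all, the series settles at C / (1 - phi1).
steady = simulate_arma11(C=1.0, phi1=0.5, theta1=0.3, shocks=[0.0] * 100)
print(impulse, steady[-1])
```

Passing the shocks in explicitly keeps the example deterministic; in a simulation study they would be drawn from a normal distribution.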

2.5 The Box-Jenkins modeling procedure

Box and Jenkins propose a practical three-stage procedure for finding a good model.
The broad outline of the modeling strategy is shown in the flow chart below. The three
stages are

1. Identification

2. Estimation

3. Diagnostic checking



Fig 2.2 Stages in the Box-Jenkins iterative approach to model building
2.5.1 What is a good model?

There is a difference between a model and a process. A process is the true but
unknown mechanism that has generated a realisation, while a model is only an
imitation or representation of the process. Because the process is unknown, we never
know whether the selected model is the correct one. The following are the characteristics of
a good model:

1. A good model is parsimonious.

2. A good AR model is stationary.

3. A good MA model is invertible.

4. A good model has high quality estimated coefficients at the estimation stage.

5. A good model has statistically independent residuals.

6. A good model fits the available data sufficiently well at the estimation stage.

7. Above all, a good model has sufficiently small forecast errors.

2.6 Model Building

2.6.1 Identification

At this stage it is necessary to identify the values of (p, d, q). This identification is
based solely on examination of the data. We use two graphical devices, the
estimated acf and the estimated pacf, to identify the underlying model. The basic idea
is that every ARIMA model has a theoretical acf and pacf
associated with it. At the identification stage we compare the estimated acf and pacf
calculated from the available data with various theoretical acfs and pacfs. We then
tentatively choose the model whose theoretical acf and pacf most closely resemble the
estimated acf and pacf of the data series. Statistical tests, such as t tests or the
chi-squared test on the acf and the pacf, are used to identify the tentative model.
Whichever model we choose at the identification stage, we consider it only tentatively:
it is only a candidate for the final model.

Theoretical acf and pacf for some common processes

The major characteristics of theoretical acfs and pacfs for stationary AR, MA and
mixed (ARMA) processes are summarised in Tables 2.3 and 2.4.
Table 2.3 Primary distinguishing characteristics of theoretical acfs and pacfs for
stationary processes

Process   acf                                     pacf
AR        Tails off toward zero (exponential      Cuts off to zero (after lag p)
          decay or damped sine wave)
MA        Cuts off to zero (after lag q)          Tails off toward zero (exponential
                                                  decay or damped sine wave)
ARMA      Tails off toward zero                   Tails off toward zero


Fig 2.5 Examples of theoretical acf and pacf for two MA(1) processes (panels plot acf and pacf against lag k)

Table 2.4 Detailed characteristics of some common stationary processes

AR(1)
   acf:   Exponential decay
          (a) on the positive side if $\phi_1 > 0$
          (b) alternating in sign, starting on the negative side, if $\phi_1 < 0$
   pacf:  Spike at lag 1, then cuts off to zero
          (a) spike is positive if $\phi_1 > 0$
          (b) spike is negative if $\phi_1 < 0$

AR(2)
   acf:   A mixture of exponential decays or a damped sine wave. The exact pattern
          depends on the signs and sizes of $\phi_1$ and $\phi_2$
   pacf:  Spikes at lags 1 and 2, then cuts off to zero

MA(1)
   acf:   Spike at lag 1, then cuts off to zero
          (a) spike is positive if $\theta_1 < 0$
          (b) spike is negative if $\theta_1 > 0$
   pacf:  Damps out exponentially
          (a) alternating in sign, starting on the positive side, if $\theta_1 < 0$
          (b) on the negative side if $\theta_1 > 0$

MA(2)
   acf:   Spikes at lags 1 and 2, then cuts off to zero
   pacf:  A mixture of exponential decays or a damped sine wave. The exact
          pattern depends on the signs and sizes of $\theta_1$ and $\theta_2$

ARMA(1, 1)
   acf:   Exponential decay from lag 1
          (a) sign of $\rho_1$ = sign of $(\phi_1 - \theta_1)$
          (b) all one sign if $\phi_1 > 0$
          (c) alternating in sign if $\phi_1 < 0$
   pacf:  Exponential decay from lag 1
          (a) $\phi_{11} = \rho_1$
          (b) all one sign if $\theta_1 > 0$
          (c) alternating in sign if $\theta_1 < 0$
2.6.2 Estimation

Once a model is tentatively identified, the parameters can be estimated by maximizing
the corresponding likelihood function, assuming that the white noise term is normally
distributed. At the estimation stage we obtain precise estimates of a small number of parameters
as we fit our tentative model to the data.

Box and Jenkins favor choosing coefficient estimates at the estimation stage
according to the maximum likelihood (ML) criterion. But finding exact ML estimates
can be computationally burdensome, so Box and Jenkins suggest the use of least
squares (LS) estimates. If the random shocks are normally distributed, LS estimates
are computationally easier to find and are exactly or very nearly ML estimates.
LS estimates are those which give the smallest sum of squared residuals ($SSR = \sum \hat{a}_t^2$).

A residual ($\hat{a}_t$) is an estimate of a random shock ($a_t$). It is defined as the
difference between an observed value ($z_t$) and a calculated value ($\hat{z}_t$). In practice the
calculated values are found by inserting estimates of the mean and of the AR and MA
coefficients into the ARIMA model being estimated, with the current random shock
assigned its expected value of zero, and applying these estimates to the available data.

Linear least squares (LLS) may be used to estimate only pure AR models
without multiplicative seasonal terms. All other models require a nonlinear least
squares (NLS) method.
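For a pure AR(p) model, the linear least squares fit amounts to regressing $z_t$ on its own p previous values and a constant. The sketch below shows one way to set this up with NumPy; the function name `fit_ar_lls` and the noise-free toy series are assumptions for illustration, not the procedure used to produce the results in this thesis.

```python
import numpy as np

def fit_ar_lls(z, p):
    """Estimate AR(p) coefficients and the constant C by linear least squares.

    Returns (phi, C, ssr), where ssr is the sum of squared residuals.
    """
    z = np.asarray(z, dtype=float)
    n = len(z)
    # Column j holds z_{t-j} for t = p .. n-1; the last column is the constant.
    lagged = [z[p - j: n - j] for j in range(1, p + 1)]
    X = np.column_stack(lagged + [np.ones(n - p)])
    y = z[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef[:p], coef[p], float(resid @ resid)

# Noise-free AR(1) toy series z_t = 0.5 z_{t-1} + 1 (illustrative only).
z_toy = [0.0]
for _ in range(9):
    z_toy.append(0.5 * z_toy[-1] + 1.0)

phi_hat, C_hat, ssr = fit_ar_lls(z_toy, p=1)
print(phi_hat, C_hat, ssr)
```

Because the toy series has no random shocks, the fit recovers the generating coefficients and the residual sum of squares is essentially zero.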

The most commonly used NLS method is a combination of two NLS
procedures, Gauss-Newton linearization and the gradient method. This combination is
sometimes called Marquardt's compromise (the algorithm is shown in Fig 2.8).
Given some initial estimates, this algorithm chooses a series of optimal coefficient
corrections. The method converges quickly to LS values in most cases. The estimated
results may be used to check a model for stationarity and invertibility.


1. Specify starting values $B_0$.

2. Find the initial sum of squared residuals ($SSR_0$).

3. Find derivatives.

4. Form linearized equations.

5. Apply linear least squares to the equations formed at step 4 to find corrections h and new
   estimates $B_1 = B_0 + h$.

6. Find the new sum of squared residuals ($SSR_1$).

7. Is $SSR_1 < SSR_0$? If not:
   7a. Reset the parameter to move closer to gradient method corrections and return to step 5.

8. Is $(SSR_0 - SSR_1) / SSR_0 < \epsilon$? If not:
   8a. Set $SSR_0 = SSR_1$ and $B_0 = B_1$ and return to step 3.

9. Assume convergence to LS estimates.



Fig 2.8 Flow chart for Marquardt's compromise for model estimation
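The steps of the flow chart can be sketched for the simplest nonlinear case, a zero-mean MA(1) model with a single coefficient. Everything below is an illustrative assumption rather than the implementation used in this work: the function names, the damping parameter `lam` and its update factor, the forward-difference derivative, the presample shock set to zero, and the simulated data.

```python
import numpy as np

def ma1_residuals(theta, z):
    """Residuals of z_t = a_t - theta * a_{t-1}, i.e. a_t = z_t + theta * a_{t-1}."""
    a = np.empty(len(z))
    prev = 0.0                      # presample shock assumed to be zero
    for t, z_t in enumerate(z):
        prev = z_t + theta * prev
        a[t] = prev
    return a

def marquardt_ma1(z, theta=0.0, lam=0.01, eps=1e-8, max_iter=100):
    a = ma1_residuals(theta, z)
    ssr0 = float(a @ a)                                  # step 2
    for _ in range(max_iter):
        d = 1e-6
        J = (ma1_residuals(theta + d, z) - a) / d        # step 3: d a_t / d theta
        while lam < 1e12:
            # Steps 4-5: damped least squares correction for one parameter.
            h = -(J @ a) / ((J @ J) * (1.0 + lam))
            a1 = ma1_residuals(theta + h, z)
            ssr1 = float(a1 @ a1)                        # step 6
            if ssr1 < ssr0:                              # step 7
                break
            lam *= 10.0                                  # step 7a
        else:
            return theta, ssr0                           # no improving step found
        converged = (ssr0 - ssr1) / ssr0 < eps           # step 8
        theta, a, ssr0 = theta + h, a1, ssr1             # step 8a
        if converged:
            return theta, ssr0                           # step 9
    return theta, ssr0

# Simulated MA(1) data with theta_1 = 0.5 (illustrative only).
rng = np.random.default_rng(0)
shocks = rng.standard_normal(2001)
z_sim = shocks[1:] - 0.5 * shocks[:-1]
theta_hat, ssr_hat = marquardt_ma1(z_sim)
print(theta_hat, ssr_hat)
```

With a small `lam` the correction is close to a Gauss-Newton step; increasing `lam` shortens the step, which is the move toward gradient-method behaviour described in step 7a.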
2.6.3 Diagnostic Checking

Once we have obtained precise estimates of the coefficients in an ARIMA model,
the third stage in the UBJ procedure, diagnostic checking, is carried out. At this stage we
decide whether the estimated model is statistically adequate. Diagnostic checking is related
to the identification stage in two important ways. First, when diagnostic checking
shows that the model is inadequate, we must return to the identification stage to
tentatively select one or more other models. Second, diagnostic checking also
provides clues about how an inadequate model might be reformulated.

The most important test of the statistical adequacy of an ARIMA model
involves the assumption that the random shocks are independent. If the random
shocks are dependent, or serially correlated, there is an autocorrelation
pattern in $z_t$ that has not been accounted for by the AR and MA terms in the model.

The basic analytical tool at the diagnostic checking stage is the residual acf.
The residual acf is calculated in the same way as the estimated acf, the only
difference being that the residuals are used instead of the actual realisation. After
calculating the residual autocorrelations, their standard errors are computed and t tests are performed.

Criterion employed

If the absolute value of a residual acf t value is less than (roughly) 1.25 at lags 1, 2
and 3, and less than about 1.6 at larger lags, we can conclude that the random shocks
at that lag are independent. This is only an approximate guide to the adequacy of the
model.
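The criterion above can be sketched as follows. The text does not state which standard error formula is used, so the code assumes Bartlett's approximation, $s(r_k) = \sqrt{(1 + 2\sum_{j<k} r_j^2)/n}$; the function names are also assumptions.

```python
import numpy as np

def residual_acf_t_values(resid, max_lag):
    """Residual autocorrelations and their approximate t values."""
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    dev = resid - resid.mean()
    denom = np.sum(dev ** 2)
    r = np.array([np.sum(dev[: n - k] * dev[k:]) / denom
                  for k in range(1, max_lag + 1)])
    # Bartlett's approximation (an assumption): uses r_1 .. r_{k-1} at lag k.
    prior = np.concatenate(([0.0], np.cumsum(r[:-1] ** 2)))
    se = np.sqrt((1.0 + 2.0 * prior) / n)
    return r, r / se

def shocks_look_independent(t_values):
    """Apply the rough limits: 1.25 at lags 1-3 and 1.6 at larger lags."""
    t = np.abs(np.asarray(t_values, dtype=float))
    lags = np.arange(1, len(t) + 1)
    limits = np.where(lags <= 3, 1.25, 1.6)
    return bool(np.all(t < limits))

r_demo, t_demo = residual_acf_t_values([1, 2, 3, 4, 5], max_lag=2)
print(t_demo, shocks_look_independent(t_demo))
```

A model that fails this check at some lag is sent back to the identification stage, with the offending lag hinting at which term may be missing.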

2.6.4 Forecasting

The ultimate application of UBJ-ARIMA modeling is to forecast future values of a
time series. After estimating the parameters and performing the diagnostic checks for the
chosen model, forecasting is done. The observations are available up to time t, the
forecast origin. Beyond the origin, observations are not available and must be replaced by
their forecasted values. Past $a_t$ values are likewise replaced by their corresponding
estimates, the estimated residuals.
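For a pure AR(p) model this substitution rule leads to a simple recursion: each forecast is fed back in as if it were an observation, and future shocks take their expected value of zero. The sketch below illustrates this; the function name and the numbers are assumptions for illustration.

```python
def forecast_ar(history, phi, C, steps):
    """Recursive forecasts from an AR(p) model z_t = C + sum_j phi_j z_{t-j} + a_t."""
    values = list(history)
    p = len(phi)
    forecasts = []
    for _ in range(steps):
        # Future shocks are replaced by their expected value, zero.
        nxt = C + sum(phi[j] * values[-1 - j] for j in range(p))
        forecasts.append(nxt)
        values.append(nxt)   # the forecast stands in for the missing observation
    return forecasts

# Hypothetical AR(1) model with phi_1 = 0.5 and C = 1.0.
f = forecast_ar(history=[0.0, 1.0], phi=[0.5], C=1.0, steps=3)
print(f)
```

The forecasts drift toward the series mean $C / (1 - \phi_1)$ as the horizon grows, which is one way to see why long term forecasts carry little information.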
2.7 Simulation Results


Example 1 Change in Business Inventories

The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


Order    Testing MSE
1        2.7401
2        2.7772
3        3.2569
4        3.0562
5        3.3678
6        3.5750
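As a small illustration of how such a table can be used, the sketch below defines the mean squared error and picks the order with the smallest testing MSE. The values are taken from the Example 1 table, with decimal points restored as an assumption; the function and variable names are hypothetical.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Testing MSE per AR order from the Example 1 table (decimal points assumed).
testing_mse = {1: 2.7401, 2: 2.7772, 3: 3.2569, 4: 3.0562, 5: 3.3678, 6: 3.5750}

best_order = min(testing_mse, key=testing_mse.get)
print(best_order)
```

Selecting on testing MSE alone favours the lowest order here, which also agrees with the preference for parsimonious models stated in Section 2.5.1.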


Plot of the MSE vs. order of the model chosen (x-axis: Order, y-axis: MSE)


Example 2 Save Rate data


The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


Order    MSE
1        0.23221
2        0.21924
3        0.22098
4        0.22502
5        0.22627
6        0.22579


Plot of the MSE vs. order of the model chosen (x-axis: Order, y-axis: MSE)
Example 3 Sunspots data


The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


Order    MSE
1        51.4028
2        57.6942
3        53.8622
4        54.0858
5        59.8086
6        58.4127


Plot of the MSE vs. order of the model chosen (x-axis: Order)
Chapter 3

NEURAL NETWORK METHODS


3.1 Introduction

One of the important and widely accepted applications of Neural Networks is in the
challenging field of forecasting a trend. Highly complex and chaotic time series can
be modelled to a good degree of accuracy by some Neural Networks.

Static neural networks, like the simple back propagation network, are trained to produce a
spatial output pattern in response to a particular spatial input pattern. However, in many
engineering, scientific and economic applications the need arises to model dynamical
processes where a time sequence is required in response to certain temporal input
signal(s). The resulting model is referred to as a temporal association network. Temporal
association networks must have a recurrent (as opposed to a static) architecture so as to handle the
time-dependent nature of the associations. Thus it would be very useful to extend the multilayer
feedforward network and its associated training algorithms (like backprop) into the
temporal domain. In general this requires a recurrent architecture (nets with feedback
connections) and proper associated learning algorithms. Two such temporal networks
discussed here are Back Propagation Through Time (BPTT) and Time Delay Neural Networks
(TDNN).

3.2 Back-Propagation Through Time

The back propagation through time (BPTT) algorithm for training a recurrent network is
an extension of the standard back propagation algorithm.

It is based on the fact that for every recurrent network there exists a feed forward
network with identical behavior over a finite period of time. This behavior of a recurrent
network can be achieved by unfolding the temporal operation of the network into a
multilayer feed forward network. The topology of such a network grows by one layer at
every time step.
A comparative representation of a simple recurrent network and a back
propagation through time network is shown in Fig 3.1. The constraint on the
network in Fig 3.1 is that the weights at each level of the feed forward network should
be the same. The appropriate method for maintaining this constraint is to keep track of
the changes dictated for each weight at each level and then change each of the weights
according to the sum of these individually prescribed changes.





Fig 3.1 Comparison of the recurrent network and a feedforward network with identical
behavior

The general rule for determining the change prescribed for a weight in the system
at a particular time is that of the simple back propagation algorithm, given below.
Fig 3.2 Back Propagation Through Time network


3.2.1 Simple Back Propagation method for weight updating

The weight update used in the simple back propagation algorithm and the equations
governing the propagation of the error are given below. The schematic diagram of such a
network is shown in Fig 3.3.

The error function or cost function that is to be minimized is

$$E = 0.5 \sum_{k=1}^{K} (d_k - o_k)^2 \qquad (3.1)$$

where $d_k$ and $o_k$ are the desired and actual outputs of output neuron k.

The weights of the neurons whose desired outputs are known are updated as

$$w_{kj} \leftarrow w_{kj} + \eta\, \delta_{ok}\, y_j \qquad (3.2)$$

The weights of the inner layer neurons, whose desired outputs are not known, are updated as

$$v_{ji} \leftarrow v_{ji} + \eta\, \delta_{yj}\, x_i \qquad (3.3)$$

where $\eta$ is the learning rate, $y_j$ is the output of hidden neuron j, $x_i$ is the i-th network input and

$$\delta_{ok} = (d_k - o_k)\, f'(net_k) \qquad (3.4)$$

$$\delta_{yj} = f'_j(net_j) \sum_{k=1}^{K} \delta_{ok}\, w_{kj} \qquad (3.5)$$

The derivative of the activation function is given by

For unipolar activation functions

$$f'(net) = o_k (1 - o_k) \qquad (3.6)$$

For bipolar activation functions

$$f'(net) = 0.5\, (1 - o_k^2) \qquad (3.7)$$



Fig 3.3 Feedforward Neural Network
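The derivative expressions in Eqs (3.6) and (3.7) can be checked numerically. The sketch below assumes the usual sigmoid forms for the two activation functions, $1/(1+e^{-net})$ for the unipolar case and $2/(1+e^{-net}) - 1$ for the bipolar case; the text does not spell these out, so they are assumptions, as are the function names.

```python
import numpy as np

def unipolar(net):
    return 1.0 / (1.0 + np.exp(-net))          # output in (0, 1)

def bipolar(net):
    return 2.0 / (1.0 + np.exp(-net)) - 1.0    # output in (-1, 1)

def unipolar_derivative(o):
    return o * (1.0 - o)                        # Eq (3.6), in terms of the output o

def bipolar_derivative(o):
    return 0.5 * (1.0 - o ** 2)                 # Eq (3.7), in terms of the output o

# Compare against central finite differences on a grid of net values.
net = np.linspace(-3.0, 3.0, 13)
h = 1e-6
num_uni = (unipolar(net + h) - unipolar(net - h)) / (2 * h)
num_bip = (bipolar(net + h) - bipolar(net - h)) / (2 * h)
max_err_uni = np.max(np.abs(num_uni - unipolar_derivative(unipolar(net))))
max_err_bip = np.max(np.abs(num_bip - bipolar_derivative(bipolar(net))))
print(max_err_uni, max_err_bip)
```

Writing the derivative in terms of the neuron output is what makes the update cheap: no extra evaluation of the activation function is needed during the backward pass.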
3.2.2 Algorithm for BPTT

Step 1: Choose the number of inputs, i.e. the past history that is to be presented to
the network.

Step 2: Present an input to the system with some initialised weights, subject to the
constraint that all corresponding weights of the network remain the same.

Step 3: Calculate and add the errors at all the neurons for which the desired output is known.

Step 4: Compute the weight changes for all the units and save the sum of all the weight changes
dictated for each particular weight.

Step 5: Change the weights by the amount of the sum of changes.

Step 6: Go to step 1 until all the iterations are completed.
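The key idea in steps 3 to 5, summing the changes dictated for a shared weight at every level of the unfolded network, can be sketched on the smallest possible recurrent model. The single tanh unit, the weight names `w` and `u`, and the data below are assumptions made for illustration and are not the network used in this work; a finite-difference comparison is included to check the summed gradient.

```python
import numpy as np

def bptt_gradients(w, u, x, d, h0=0.0):
    """Gradients of E = 0.5 * sum_t (d_t - h_t)^2 for h_t = tanh(w*h_{t-1} + u*x_t)."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    T = len(x)
    h = np.empty(T + 1)
    h[0] = h0
    for t in range(T):                       # forward pass through the unfolded network
        h[t + 1] = np.tanh(w * h[t] + u * x[t])
    grad_w = grad_u = 0.0
    carry = 0.0                              # error flowing back from later time steps
    for t in reversed(range(T)):
        dE_dh = (h[t + 1] - d[t]) + carry
        delta = dE_dh * (1.0 - h[t + 1] ** 2)
        grad_w += delta * h[t]               # sum the change dictated at each level
        grad_u += delta * x[t]
        carry = delta * w
    error = 0.5 * np.sum((d - h[1:]) ** 2)
    return grad_w, grad_u, error

# Hypothetical inputs and targets.
x_demo, d_demo = [0.5, -0.3, 0.8], [0.2, 0.1, -0.4]
gw, gu, _ = bptt_gradients(0.7, 0.3, x_demo, d_demo)

# Finite-difference check of the summed gradient for the recurrent weight.
step = 1e-6
num_gw = (bptt_gradients(0.7 + step, 0.3, x_demo, d_demo)[2]
          - bptt_gradients(0.7 - step, 0.3, x_demo, d_demo)[2]) / (2 * step)
num_gu = (bptt_gradients(0.7, 0.3 + step, x_demo, d_demo)[2]
          - bptt_gradients(0.7, 0.3 - step, x_demo, d_demo)[2]) / (2 * step)
print(gw, num_gw)
```

A weight update would then subtract a learning rate times `gw` and `gu`, applying the same change to every copy of the weight in the unfolded network.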

3.3 Time Delay Neural Networks

Simple recurrent networks have proven inadequate for several prediction
tasks, possibly because simple gradient procedures do not perform well in complex
prediction tasks characterized by the existence of many local optima. We now consider
a neural network model that has been used successfully for solving some problems in
which values of a variable need to be predicted from past values of that variable.

The generic network model shown in Fig 3.4 consists of a preliminary
preprocessing component that transforms an external input vector $x(t)$ into a
preprocessed vector $\bar{x}(t)$.



Fig 3.4 Generic neural network model for prediction

The preprocessed vector $\bar{x}(t)$ is supplied to a feed forward network. The feed
forward network is trained to compute the desired output values for a specific input $x(t)$.
Note that $x$ and $\bar{x}$ may be of different dimensionality. The preprocessing
component may implement a short term memory, and the feed forward network is the
predictor component, following the terminology of Mozer.

Consider a prediction task in which $x(t)$ is to be predicted from $x(t-1)$ and $x(t-2)$. In this
simple case, $x$ at time t consists of a single input $x(t)$, and $\bar{x}$ at time t consists of the
vector $(x(t-1), x(t-2))$ supplied as input to the feed forward network. For this example,
preprocessing consists merely of storing past values of the variable and supplying
them to the network along with the latest value. Such a model is sometimes called a
Tapped Delay Line Neural Network, consisting of a sequence of delay units or buffers,
with the values of the variable at recent instants being supplied to the feed forward predictor
component.
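The tapped delay line described above amounts to turning a series into pairs of (recent values, next value). A minimal sketch, with an assumed function name and a toy series, is:

```python
import numpy as np

def tapped_delay_inputs(x, taps):
    """Build network inputs (x(t-1), ..., x(t-taps)) and targets x(t)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Column j-1 holds x(t-j) for every t from taps to n-1.
    inputs = np.column_stack([x[taps - j: n - j] for j in range(1, taps + 1)])
    targets = x[taps:]
    return inputs, targets

# Toy series with two delay elements (illustrative only).
X_in, y_out = tapped_delay_inputs([1, 2, 3, 4, 5], taps=2)
print(X_in)
print(y_out)
```

Each row of `X_in` is one input pattern for the feed forward predictor, so the number of taps fixes the size of its input layer.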

The architecture of this model is shown in Fig 3.5. Algorithms such as back
propagation and its variants, given in Section 3.2.1, are used to train the weights in the
network.




x(t) -> [delay elements] -> TIME DELAY NEURAL NETWORK PREDICTOR -> x(t+1)


Fig 3.5 Time Delay Neural Network (TDNN) with two delay elements

Several questions need to be answered before attempting to forecast with a
multilayer feedforward (MLFF) network. How many samples are required? How many input data points should be used,
that is, what window size is best? What prediction horizon should be chosen (how far into
the future to predict)? How should the test and training data be divided? What network
configuration should be used (number of hidden layers, number of nodes per layer, type
of activation functions, type of output nodes), and so on? Perhaps the most difficult
question of all to answer is what to predict. For example, if the series is the Standard and
Poor's (S&P) 500 stock index, should the index value be forecast or should the direction of
the series be forecast? These questions are not easily answered. Knowledge of the dynamics
driving the system to be forecast can provide some suggestions on good network
architecture and network parameters. Therefore, if the user has tools to carry out
statistical and chaotic data analyses, it may be possible to characterize the system and
gain some insight into the best choice of network parameters. In any case, one should
set realistic objectives on what is achievable and then experiment with different
architectures and training data. The general guidelines for the construction of MLFF
networks should also be followed if applicable, but usually some experimentation will
still be necessary. Besides the time series itself, other data or indicators may also be
useful in improving the accuracy of forecasting. For example, the use of moving
averages, trading volume, momentum and other relevant data can sometimes improve
forecasting accuracy significantly (Patterson et al 1993).
3.4 Simulation Results with BPTT

Example 1 Change in Business Inventories


The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


Taps     MSE
2        15.7178
5        15.8270
10       6.2372
15       7.8128
28       3.0232
35       1.1022


Plot of the MSE vs. order of the model chosen (y-axis: MSE)


Example 2 Save Rate data


The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


(Plot axis: Order, 10 to 50)
Example 3 Sunspots data


The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


Taps     MSE
20       97.0311
40       82.5108
80       58.3691
100      50.8420
120      46.3247
140      41.7216


Plot of the MSE vs. order of the model chosen
3.5 Simulation Results with TDNN

Example 1 Change in Business Inventories


The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


Taps     MSE
1        10.072
3        2.922
5        2.308
7        1.813
9        1.584


Plot of the MSE vs. order of the model chosen
Example 2 Save Rate data


The testing MSE obtained for the different orders chosen is tabulated below.
The variation of the testing MSE with respect to the order chosen is also plotted
graphically.


Taps     MSE
1        0.456
3        0.391
5        0.348
7        0.186
9        0.143


Plot of the MSE vs. order of the model chosen





Example 3 Sunspots data


The testing MSE obtained for the different orders chosen is tabulated below. The variation of
the testing MSE with respect to the order chosen is also plotted graphically.


(Plot axis: Order)
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Chapter 4 

Results and Discussions 


4.1 Introduction

The results of the statistical and Neural Network methods are shown for three examples,
namely:

Example 1 Change in Business Inventories data
Example 2 Save Rate data
Example 3 Sunspots data

The comparison of all these methods is discussed at the end of the chapter.

4.2 EXAMPLE 1 Business Inventories
4.2.1 About the data

The data in this example are the changes in business inventories, stated at annual rates in
billions of dollars. The 60 observations cover the period from the first quarter of 1955
through the fourth quarter of 1969.



Fig 4.1 The change in business inventories data

4.2.2 UBJ Model

Training set 0 - 50
Testing set 50 - 60
Model AR(1)
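For reference, an AR(1) model in the usual Box-Jenkins form, together with its one-step-ahead forecast, can be written as below. The symbols C, \phi_1 and a_t are standard notation assumed here rather than taken from this chapter, and the fitted coefficient values for this example are not reproduced.

```latex
\[
z_t = C + \phi_1 z_{t-1} + a_t ,
\qquad
\hat{z}_t = C + \phi_1 z_{t-1}
\]
```

Here z_t is the observation at time t, a_t is a zero-mean random shock, and \hat{z}_t is the forecast of z_t made from the previous observation.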



Fig 4.2 Training performance using the UBJ model



Fig 4.3 Testing performance using the UBJ model




4.2.4 Time Delay Neural Network


No. of iterations 50,000        Learning factor 0.000015

No. of hidden layers 15         No. of taps 7
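How settings of this kind enter a time delay network can be sketched as follows. This is a minimal illustration and not the thesis implementation: it assumes a single tanh hidden layer fed by a tapped delay line, a linear output unit and plain online gradient descent, and it reads the 15 as the size of that one hidden layer. The class and function names are hypothetical, and the demonstration uses a toy sine series with a much larger learning rate and far fewer passes than the 0.000015 and 50,000 quoted above.

```python
import math
import random


def delay_windows(series, taps):
    """Pair every value with the `taps` values that precede it."""
    inputs = [series[t - taps:t] for t in range(taps, len(series))]
    targets = [series[t] for t in range(taps, len(series))]
    return inputs, targets


class TinyTDNN:
    """Tapped delay line -> one tanh hidden layer -> linear output (assumed layout)."""

    def __init__(self, taps, hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(taps)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, y

    def train_step(self, x, target, rate):
        """One gradient-descent update on the squared error of a single sample."""
        h, y = self.forward(x)
        err = y - target
        for j, hj in enumerate(h):
            # Back-propagated error at hidden unit j (computed before w2[j] changes).
            delta = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= rate * err * hj
            self.b1[j] -= rate * delta
            for i, xi in enumerate(x):
                self.w1[j][i] -= rate * delta * xi
        self.b2 -= rate * err

    def mse(self, inputs, targets):
        total = sum((self.forward(x)[1] - t) ** 2 for x, t in zip(inputs, targets))
        return total / len(targets)


# Toy series (not thesis data), with 7 taps and 15 hidden units as in the text.
series = [math.sin(0.3 * t) for t in range(120)]
inputs, targets = delay_windows(series, taps=7)
net = TinyTDNN(taps=7, hidden=15)
before = net.mse(inputs, targets)
for _ in range(50):                       # passes over the training windows
    for x, t in zip(inputs, targets):
        net.train_step(x, t, rate=0.01)
after = net.mse(inputs, targets)
print(before, after)
```

The number of taps fixes how much past history the network sees at each step, which is why it plays the role of the model order in the tables of testing MSE.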



Predicted Vs Actual 



Fig 4.7 Testing performance using TDNN

4.3 EXAMPLE 2 Save rate data

4.3.1 About the data


The saving rate is personal saving as a percent of disposable personal income. Some
economists believe shifts in this rate contribute to business fluctuations. In this example,
100 quarterly observations of the U.S. saving rate for the years 1955 to 1979 are analyzed.


X axis: saving rate
Y axis: time (quarterly)



Fig 4.8 The Save rate data

4.3.2 UBJ Model


Training set 0 - 85
Testing set 86 - 104
Model AR(2)



Fig 4.9 Training performance using the UBJ model


Testing MSE = 0.21924



Fig 4.10 Testing performance using the UBJ model

4.3.3 Back Propagation Through Time


No. of iterations 10,000
Learning rate 0.005
No. of inputs taken 50



Fig 4.11 Training error (MSE) using BPTT


Plot of the Predicted vs. Actual



Fig 4.12 Predicted vs. Actual using BPTT

4.4 EXAMPLE 3 Sunspots data

4.4.1 About the data

Sunspots are areas of strong magnetic field on the surface of the sun. They are high-intensity
electromagnetic flares of solar radiation whose exact origin and causes are largely
unknown and unpredictable. They have major effects on various terrestrial
phenomena and activities, for example long-range weather prediction, telecommunications and
interplanetary flight. When a sunspot is in line with the earth, radio signals may fade out and
teletype messages may be disturbed. The mysterious cycles observed in sunspot data are a
challenge to statisticians.

4.4.2 UBJ Model


Training set 0 - 285
Testing set 286 - 295
Model AR(1)



Fig 4.16 Training performance using the UBJ method


Testing MSE = 51.4028



Fig 4.17 Testing performance using the UBJ method

4.4.3 Back Propagation Through Time


No. of iterations 2,000
Past history taken 120
Final error 1.2746

Learning factor 0.003

Plot of Training Error



Fig 4.18 Training error (MSE) using BPTT


Plot of the Predicted vs. Actual



Fig 4.19 Predicted vs. Actual using BPTT

4.4.4 Time Delay Neural Network


No. of iterations 60,000        Learning rate 0.0000016

1*5 Hidden neurons 20 



Fig 4.20 Training error using TDNN (x axis: iterations)



Fig 4.21 Testing performance using TDNN

4.5 Comparison of the Methods Used

From the simulation results shown above, the following comparisons can be made
regarding each method, and the advantages and limitations of each method can be
identified. The parameters considered for the comparison of the three forecasting
methods are:

a) The accuracy of forecasting

b) The number of iterations needed for training

c) The amount of past history needed to model the series

Example 1

                         UBJ         BPTT        TDNN

No. of iterations        -           10,000      50,000

No. of past histories    1           28          7

Testing MSE              2.7413      3.0232      1.8129
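The Example 1 figures can be held in a small record and ranked on any of the three criteria. The sketch below is only an illustration: the record type and function names are hypothetical, and the numbers are copied from the Example 1 table above.

```python
from typing import NamedTuple, Optional


class MethodResult(NamedTuple):
    method: str
    iterations: Optional[int]   # None where no iterative training is reported
    past_values: int            # amount of past history used by the model
    testing_mse: float


# Figures copied from the Example 1 comparison table.
EXAMPLE_1 = [
    MethodResult("UBJ", None, 1, 2.7413),
    MethodResult("BPTT", 10_000, 28, 3.0232),
    MethodResult("TDNN", 50_000, 7, 1.8129),
]


def rank_by_testing_mse(results):
    """Most accurate method first (lowest testing MSE)."""
    return sorted(results, key=lambda r: r.testing_mse)


for place, result in enumerate(rank_by_testing_mse(EXAMPLE_1), start=1):
    print(place, result.method, result.testing_mse)
```

For Example 1 this puts TDNN first on accuracy, followed by UBJ and then BPTT, while UBJ needs the least past history.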


Example 2

                         UBJ         BPTT        TDNN

No. of iterations        -           10,000      50,000

No. of past histories    2           50          7

Testing MSE              0.21924     0.2314      0.1859


Example 3

                         UBJ         BPTT        TDNN

No. of iterations        -           2,000       60,000

No. of past histories    1           120         20

Testing MSE              51.4028     46.3247     65.2865

ARIMA models require less past history than the other models. Their accuracy of
forecasting is satisfactory.

BPTT models converge after a large number of iterations when compared with the Time
Delay Neural Networks. In Examples 1 and 2 the accuracy of prediction of BPTT is poorer
than that of the other two models, although it gives the lowest testing MSE in Example 3.
Back Propagation Through Time also needs large amounts of memory to store the
intermediate weights when the data set is large.

TDNN models take fewer iterations than BPTT. In Examples 1 and 2 their accuracy of
prediction is better than that of both BPTT and the ARIMA models.

Chapter 5


CONCLUSIONS AND FUTURE
SCOPE OF WORK


5.1 Conclusions

The conclusions based on the results obtained are given below.

• Forecasting results are almost the same for both approaches if the time series to be
modelled is not very chaotic and not very large.

• Chaotic time series can be modelled more precisely by Neural Networks.

• Modelling a time series using Neural Network methods is more tedious, and choosing
the appropriate model takes a lot of time.

5.2 Scope for Future Work

• Techniques such as fuzzy clustering can be used for pre-processing the data before
doing the analysis. This improves the overall accuracy of forecasting.

• The Neural Network model chosen can be optimally designed by employing pruning
and optimisation methods.

• Econometric methods can be developed using Neural Networks. Multivariable analysis
should be done for this purpose.
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